AITopics

Technology: Information Technology > Artificial Intelligence > Vision (0.95)

Neural Information Processing Systems, Nov-20-2025, 16:19:34 GMT

Unsupervised Learning of Shape and Pose with Differentiable Point Clouds

We live in a three-dimensional world, and a proper understanding of its volumetric structure is crucial for acting and planning.

artificial intelligence, machine learning, point cloud, (16 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial Intelligence, Nov-12-2025

Hierarchical Direction Perception via Atomic Dot-Product Operators for Rotation-Invariant Point Clouds Learning

Hu, Chenyu, Li, Xiaotong, Zhu, Hao, Hou, Biao

Point cloud processing has become a cornerstone technology in many 3D vision tasks. However, arbitrary rotations introduce variations in point cloud orientations, posing a long-standing challenge for effective representation learning. The core of this issue is the disruption of the point cloud's intrinsic directional characteristics caused by rotational perturbations. Recent methods attempt to implicitly model rotational equivariance and invariance, preserving directional information and propagating it into deep semantic spaces. Yet, they often fall short of fully exploiting the multiscale directional nature of point clouds to enhance feature representations. To address this, we propose the Direction-Perceptive Vector Network (DiPVNet). At its core is an atomic dot-product operator that simultaneously encodes directional selectivity and rotation invariance, endowing the network with both rotational symmetry modeling and adaptive directional perception. At the local level, we introduce a Learnable Local Dot-Product (L2DP) Operator, which enables interactions between a center point and its neighbors to adaptively capture the non-uniform local structures of point clouds. At the global level, we leverage generalized harmonic analysis to prove that the dot-product between point clouds and spherical sampling vectors is equivalent to a direction-aware spherical Fourier transform (DASFT). This leads to the construction of a global directional response spectrum for modeling holistic directional structures. We rigorously prove the rotation invariance of both operators. Extensive experiments on challenging scenarios involving noise and large-angle rotations demonstrate that DiPVNet achieves state-of-the-art performance on point cloud classification and segmentation tasks. Our code is available at https://github.com/wxszreal0/DiPVNet.
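The rotation invariance that the abstract attributes to dot-product operators rests on a basic property: rotating two vectors by the same rotation leaves their dot product unchanged. The sketch below checks this numerically for center-to-neighbor offset vectors. It is only an illustration of that property, not the paper's L2DP or DASFT operator; the helper names (`rotate`, `local_dot_features`) and the sample coordinates are assumptions made for this example.

```python
import math

def rotate(p, angle_z, angle_x):
    """Rotate a 3D point about the z-axis, then about the x-axis (radians)."""
    x, y, z = p
    cz, sz = math.cos(angle_z), math.sin(angle_z)
    x, y = cz * x - sz * y, sz * x + cz * y
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    y, z = cx * y - sx * z, sx * y + cx * z
    return (x, y, z)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def local_dot_features(center, neighbors):
    """All pairwise dot products between center-to-neighbor offset vectors."""
    offsets = [tuple(n_i - c_i for n_i, c_i in zip(n, center)) for n in neighbors]
    return [dot(u, v) for u in offsets for v in offsets]

# Hypothetical local neighborhood of one point.
center = (0.2, -0.1, 0.5)
neighbors = [(1.0, 0.0, 0.3), (0.1, 0.9, -0.4), (-0.6, 0.2, 0.8)]

before = local_dot_features(center, neighbors)

# Apply the same rotation to the whole neighborhood and recompute.
az, ax = 1.234, -0.567
after = local_dot_features(
    rotate(center, az, ax), [rotate(n, az, ax) for n in neighbors]
)

# The features agree up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(before, after))
```

Because the offsets are taken relative to the center point, these features are also unaffected by translating the whole neighborhood.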

artificial intelligence, machine learning, point cloud, (14 more...)

2511.0824

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Oct-22-2025

A Multimodal Deep Learning Approach for White Matter Shape Prediction in Diffusion MRI Tractography

Lo, Yui, Chen, Yuqian, Liu, Dongnan, Zekelman, Leo, Rushmore, Jarrett, Rathi, Yogesh, Makris, Nikos, Golby, Alexandra J., Zhang, Fan, Cai, Weidong, O'Donnell, Lauren J.

Shape measures have emerged as promising descriptors of white matter tractography, offering complementary insights into anatomical variability and associations with cognitive and clinical phenotypes. However, conventional methods for computing shape measures are computationally expensive and time-consuming for large-scale datasets due to reliance on voxel-based representations. We propose Tract2Shape, a novel multimodal deep learning framework that leverages geometric (point cloud) and scalar (tabular) features to predict ten white matter tractography shape measures. To enhance model efficiency, we utilize a dimensionality reduction algorithm for the model to predict five primary shape components. The model is trained and evaluated on two independently acquired datasets, the HCP-YA dataset, and the PPMI dataset. We evaluate the performance of Tract2Shape by training and testing it on the HCP-YA dataset and comparing the results with state-of-the-art models. To further assess its robustness and generalization ability, we also test Tract2Shape on the unseen PPMI dataset. Tract2Shape outperforms SOTA deep learning models across all ten shape measures, achieving the highest average Pearson's r and the lowest nMSE on the HCP-YA dataset. The ablation study shows that both multimodal input and PCA contribute to performance gains. On the unseen testing PPMI dataset, Tract2Shape maintains a high Pearson's r and low nMSE, demonstrating strong generalizability in cross-dataset evaluation. Tract2Shape enables fast, accurate, and generalizable prediction of white matter shape measures from tractography data, supporting scalable analysis across datasets. This framework lays a promising foundation for future large-scale white matter shape analysis.
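The two evaluation metrics named in the abstract can be sketched in a few lines. Pearson's r is standard; the abstract does not define nMSE, so the version below (mean squared error divided by the variance of the targets) is an assumption, and the function names are chosen only for this example.

```python
import math

def pearson_r(pred, target):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in target))
    return cov / (norm_p * norm_t)

def nmse(pred, target):
    """Mean squared error normalized by the target variance (assumed definition)."""
    n = len(target)
    mean_t = sum(target) / n
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    return mse / var_t

# Hypothetical shape-measure values for four subjects.
target = [1.0, 2.0, 3.0, 4.0]
scaled = [2.0, 4.0, 6.0, 8.0]      # perfectly correlated, but wrong scale
mean_only = [2.5, 2.5, 2.5, 2.5]   # always predicts the target mean

r_scaled = pearson_r(scaled, target)
nmse_perfect = nmse(target, target)
nmse_mean = nmse(mean_only, target)
```

Under this normalization, a model that always predicts the mean scores 1, so values well below 1 indicate predictions that explain most of the target variance. The `scaled` case shows why the two metrics are reported together: correlation alone does not penalize a systematic scale error.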

artificial intelligence, machine learning, shape measure, (15 more...)

doi: 10.1002/hbm.70396

2504.184

Country: North America > United States (0.69)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.66)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Jan-22-2025, 18:54:40 GMT

Reviews: PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation

Originality: - L3: "to the best of our knowledge, there is no method yet to achieve domain adaptation on 3D data, especially point cloud data" see below [SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud, Wu et al 2018] proposes a domain adaptation pipeline for 3D lidar point clouds to reduce the distribution gap between synthetic and real data. Technically the above two are operating in image space (depth semantic segmentation maps, and BEV of point cloud, respectively) but the underlying goal is still to model 3D information from point clouds. The first paper is from 2018, so I do think this paper overclaims this 'first to do domain adaptation in 3D data' statement a bit. It is worth noting, though, that this paper explores a point-based representation rather than an image-based one, and a classification task rather than point segmentation. But I think the similarities and differences should be mentioned and discussed.

domain adaptation, domain adaption network, point cloud representation, (8 more...)

Technology: Information Technology > Artificial Intelligence > Vision (0.78)

arXiv.org Artificial Intelligence, Nov-2-2024

TractShapeNet: Efficient Multi-Shape Learning with 3D Tractography Point Clouds

Lo, Yui, Chen, Yuqian, Liu, Dongnan, Legarreta, Jon Haitz, Zekelman, Leo, Zhang, Fan, Rushmore, Jarrett, Rathi, Yogesh, Makris, Nikos, Golby, Alexandra J., Cai, Weidong, O'Donnell, Lauren J.

Brain imaging studies have demonstrated that diffusion MRI tractography geometric shape descriptors can inform the study of the brain's white matter pathways and their relationship to brain function. In this work, we investigate the possibility of utilizing a deep learning model to compute shape measures of the brain's white matter connections. We introduce a novel framework, TractShapeNet, that leverages a point cloud representation of tractography to compute five shape measures: length, span, volume, total surface area, and irregularity. We assess the performance of the method on a large dataset including 1,065 healthy young adults. Experiments for shape measure computation demonstrate that our proposed TractShapeNet outperforms other point-cloud-based neural network models in both the Pearson correlation coefficient and normalized error metrics. We compare the inference runtime results with the conventional shape computation tool DSI-Studio. Our results demonstrate that a deep learning approach enables faster and more efficient shape-measure computation. We also conduct experiments on two downstream language cognition prediction tasks, showing that shape measures from TractShapeNet perform similarly to those computed by DSI-Studio.
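Two of the five shape measures listed in the abstract, length and span, are simple to illustrate for a single streamline stored as an ordered list of 3D points. The definitions used here (length as the sum of segment lengths, span as the straight-line distance between the two endpoints) are assumptions for illustration; bundle-level measures as computed by DSI-Studio or predicted by TractShapeNet may be defined or aggregated differently.

```python
import math

def streamline_length(points):
    """Sum of Euclidean distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def streamline_span(points):
    """Straight-line distance between the first and last point."""
    return math.dist(points[0], points[-1])

# Hypothetical streamline with one right-angle bend (units: mm).
streamline = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]

length = streamline_length(streamline)
span = streamline_span(streamline)
```

For a perfectly straight streamline the two values coincide, so the gap between them gives a rough indication of how much the path bends.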

artificial intelligence, machine learning, shape measure, (15 more...)

2410.22099

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Oct-9-2024, 19:18:48 GMT

PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation

domain adaption network, point cloud data, point cloud representation, (7 more...)

Technology: Information Technology > Artificial Intelligence > Vision (0.98)

arXiv.org Artificial Intelligence, Jul-1-2024

AdaFold: Adapting Folding Trajectories of Cloths via Feedback-loop Manipulation

Longhini, Alberta, Welle, Michael C., Erickson, Zackory, Kragic, Danica

AdaFold extracts a particle-based representation of cloth from RGB-D images and feeds the representation back to a model predictive controller to re-plan the folding trajectory at every time step. A key component of AdaFold that enables feedback-loop manipulation is the use of semantic descriptors extracted from geometric features. These descriptors enhance the particle representation of the cloth to distinguish between ambiguous point clouds of differently folded cloths. Our experiments demonstrate AdaFold's ability to adapt folding trajectories to cloths with varying physical properties and generalize from simulated training to real-world execution.

artificial intelligence, machine learning, trajectory, (15 more...)

2403.0621

Country: North America (0.28)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas (0.35)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)

arXiv.org Artificial Intelligence, Feb-25-2024

MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding

Yu, Hai-Tao, Song, Mofei

In perception, multiple kinds of sensory information are integrated to map visual information from 2D views onto 3D objects, which is beneficial for understanding in 3D environments. But a single 2D view, rendered from one angle, can provide only limited partial information. The richness and value of multi-view 2D information can provide superior self-supervised signals for 3D objects. In this paper, we propose a novel self-supervised point cloud representation learning method, MM-Point, which is driven by intra-modal and inter-modal similarity objectives. The core of MM-Point lies in the multi-modal interaction and transmission between 3D objects and multiple 2D views at the same time. To carry out the consistent cross-modal objective over 2D multi-view information more effectively with contrastive learning, we further propose Multi-MLP and Multi-level Augmentation strategies. Through carefully designed transformation strategies, we further learn multi-level invariance in 2D multi-views. MM-Point demonstrates state-of-the-art (SOTA) performance in various downstream tasks. For instance, it achieves a peak accuracy of 92.4% on the synthetic dataset ModelNet40, and a top accuracy of 87.8% on the real-world dataset ScanObjectNN, comparable to fully supervised methods. Additionally, we demonstrate its effectiveness in tasks such as few-shot classification, 3D part segmentation and 3D semantic segmentation.
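As a rough illustration of an inter-modal contrastive objective of the kind the abstract mentions, the sketch below computes an InfoNCE-style loss that pulls each 3D embedding toward the embedding of its own rendered view and away from the views of other objects. The abstract does not give MM-Point's loss, so the formulation, the `temperature` value, and the toy embeddings are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def info_nce(point_embs, image_embs, temperature=0.1):
    """Mean cross-entropy where the i-th point embedding should match
    the i-th image embedding among all image embeddings in the batch."""
    losses = []
    for i, p in enumerate(point_embs):
        logits = [cosine(p, img) / temperature for img in image_embs]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)

# Toy 2D embeddings for two objects (hypothetical values).
point_embs = [(1.0, 0.0), (0.0, 1.0)]
aligned_views = [(1.0, 0.0), (0.0, 1.0)]   # each view matches its object
swapped_views = [(0.0, 1.0), (1.0, 0.0)]   # views paired with the wrong object

loss_aligned = info_nce(point_embs, aligned_views)
loss_swapped = info_nce(point_embs, swapped_views)
```

The loss is near zero when each point cloud is most similar to its own view and grows large when the pairing is wrong. An intra-modal term could reuse the same function with two augmented embeddings from the same modality.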

information, point cloud, representation, (14 more...)

2402.10002

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial Intelligence, Jan-30-2024

Generative Design of Crystal Structures by Point Cloud Representations and Diffusion Model

Li, Zhelin, Mrad, Rami, Jiao, Runxian, Huang, Guan, Shan, Jun, Chu, Shibing, Chen, Yuanping

Efficiently generating energetically stable crystal structures has long been a challenge in material design, primarily due to the immense number of possible arrangements of atoms in a crystal lattice. To facilitate the discovery of stable materials, we present a framework for the generation of synthesizable materials, leveraging a point cloud representation to encode intricate structural information. At the heart of this framework lies a diffusion model. To gauge the efficacy of our approach, we employ it to reconstruct input structures from our training datasets, rigorously validating its high reconstruction performance. Furthermore, we demonstrate the potential of Point Cloud-Based Crystal Diffusion (PCCD) by generating entirely new materials, emphasizing their synthesizability. Our research contributes to the advancement of materials design and synthesis through generative design instead of conventional substitution or experience-based discovery.

atom, crystal structure, diffusion model, (13 more...)

2401.13192

Country:

Asia > China (0.05)
Europe > Austria > Vienna (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)